A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights.
1. Field of the Invention
The present invention relates to the field of document summarization, otherwise known as automatic abstracting, wherein an extract of a document (i.e., a selection of sentences from the document) can serve as an abstract.
2. Background of the Invention
The advent of the personal computer and modern telecommunications has resulted in millions of computer users communicating with each other around the globe. One of the primary uses of such computers by such users is accessing the vast store of digital information which has been created over the last several decades. Further, additional digital information is created daily due to both the conversion of information previously unavailable digitally and the large amount of new information created by an ever increasing computer user population.
One concern with this vast, ever increasing amount of digital information is the time it takes to read even a small portion of it. Whether one is reviewing a previously arranged set of documents, as in the case of reading an on-line newspaper or magazine, reviewing the results of an electronic search, or scanning documents stored on a large hard disk drive of a personal computer, it can still take considerable time to read more than a minimal amount.
What is needed, therefore, is a facility which provides a summary or abstract of each document. Having a summary of each document allows the reader to determine whether that document is of interest and, hence, whether reading more of the document might be desirable. Conversely, reading the summary of a document could suffice to inform the reader about the document or, instead, could indicate to the reader that the particular document is not of interest. No matter the result, a good document abstract mechanism could be quite valuable in the modern digital world.
However, a good document abstract mechanism means more than merely providing an automatic summary of a document. Prior approaches to document summarization or "Automatic Sentence extraction", as discussed on pages 87-89 of the "Introduction to Modern Information Retrieval" by Salton and McGill, Copyright 1983, incorporated herein by reference in its entirety, have yet to yield abstracts "in a readable natural language context" which "obey normal stylistic constraints." Salton and McGill further state that "[r]eadable extracts are obtainable without excessive difficulties, but perfection cannot be expected within the foreseeable future."
One difficulty with prior document abstract mechanisms, even when overcoming many of the natural language barriers, is that the system or mechanism can never know for certain whether the user is receiving as much or as little of an abstract as they would like. In other words, no matter how well the mechanism can determine which portions of the document to include in the summary or abstract, the mechanism can never automatically include just the right amount of abstract to always please the user. This can be due to different users' interest levels, different users' reasons for reviewing the document, and even time-varying or situation-varying interests of the same user. As such, what is needed is not necessarily a better abstracting algorithm as much as a mechanism which allows the user to interactively specify whether the present abstract is sufficient or, instead, whether more or less of the original document should be included in the abstract or summary.
The present invention utilizes an interactive control which allows the user to specify whether more or less of the original document should be included in the document summary. Allowing the user to interactively control how much of the original document gets included in the summary facilitates rapid review of documents in which the user has little interest as well as review of up to the entire document in the case of great user interest. Furthermore, such interactive control allows the user to expand and contract summarized documents at will, thus freeing the user to focus on the content of the summarized document rather than on trying to determine what amount or percentage is sufficient or how the underlying abstracting mechanism operates.
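By way of illustration only, the interactive expansion and contraction described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names `split_sentences`, `rank_sentences`, and `AdjustableSummary` are hypothetical, the sentence splitter is deliberately naive, and the word-frequency score merely stands in for whatever abstracting mechanism is actually used. The sketch shows only that the user's "more" and "less" requests change how many ranked sentences are extracted, while the extracted sentences are always presented in their original document order.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def rank_sentences(sentences):
    # Placeholder scoring (assumption, not the patent's method):
    # a sentence's score is the sum of the document-wide frequencies of its words.
    words = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    freq = Counter(w for ws in words for w in ws)
    scores = [sum(freq[w] for w in ws) for ws in words]
    # Sentence indices from highest to lowest score; ties keep document order.
    return sorted(range(len(sentences)), key=lambda i: -scores[i])


class AdjustableSummary:
    """Hypothetical interactive control over how much of a document is shown."""

    def __init__(self, text, initial=1):
        self.sentences = split_sentences(text)
        self.order = rank_sentences(self.sentences)
        # Clamp the starting size to the range [0, number of sentences].
        self.count = max(0, min(initial, len(self.sentences)))

    def more(self):
        # Include one more sentence, up to the entire document.
        self.count = min(self.count + 1, len(self.sentences))
        return self.render()

    def less(self):
        # Include one fewer sentence, down to an empty summary.
        self.count = max(self.count - 1, 0)
        return self.render()

    def render(self):
        # Take the top-ranked sentences, then restore document order.
        chosen = sorted(self.order[: self.count])
        return " ".join(self.sentences[i] for i in chosen)
```

In such a sketch, a user interface would bind `more()` and `less()` to the interactive control, so the user never needs to choose a percentage or understand the ranking; repeated calls to `more()` eventually yield the entire document.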